18 research outputs found

    The miniature genome of a carnivorous plant Genlisea aurea contains a low number of genes and short non-coding sequences

    Abstract. Background: Genlisea aurea (Lentibulariaceae) is a carnivorous plant with an unusually small genome of 63.6 Mb, one of the smallest known among higher plants. Data on the genome sizes and the phylogeny of Genlisea suggest that this is a derived state within the genus. Thus, G. aurea is an excellent model organism for studying evolutionary mechanisms of genome contraction. Results: Here we report sequencing and de novo draft assembly of the G. aurea genome. The assembly consists of 10,687 contigs with a total length of 43.4 Mb and includes 17,755 complete and partial protein-coding genes. Its comparison with the genome of Mimulus guttatus, another representative of the higher core Lamiales clade, reveals striking differences in gene content and length of non-coding regions. Conclusions: Genome contraction was a complex process that involved gene loss and reduction of the lengths of introns and intergenic regions, but not intron loss. Gene loss is more frequent for genes that belong to multigenic families, indicating that genetic redundancy is an important prerequisite for genome size reduction. http://deepblue.lib.umich.edu/bitstream/2027.42/112458/1/12864_2013_Article_5207.pd

    Comparative genome analysis of Pseudogymnoascus spp. reveals primarily clonal evolution with small genome fragments exchanged between lineages

    Abstract. Background: Pseudogymnoascus spp. is a broad group of fungal lineages in the family Pseudorotiaceae that includes P. destructans, an aggressive pathogen of bats. Although several lineages of P. spp. have been shown to produce ascospores in culture, the vast majority of P. spp. show no evidence of sexual reproduction. P. spp. can tolerate a wide range of temperatures and salinities and can survive even in the permafrost layer. The adaptability of P. spp. to different environments is accompanied by extremely variable morphology and physiology. Results: We sequenced genotypes of 14 strains of P. spp., 5 of which were extracted from permafrost, 1 from a cryopeg (a layer of unfrozen ground in permafrost), and 8 from temperate surface environments. All sequenced genotypes are haploid. Nucleotide diversity among these genomes is very high, with a typical evolutionary distance at synonymous sites dS ≈ 0.5, suggesting that the last common ancestor of these strains lived >50 Mya. The strains extracted from permafrost do not form a separate clade. Instead, each permafrost strain has close relatives from temperate environments. We observed a strictly clonal population structure with no conflicting topologies for ~99% of genome sequences. However, there are a number of short (~100–10,000 nt) genomic segments, with a total length of 67.6 Kb, whose phylogenetic patterns are strikingly different from the rest of the genome. The most remarkable case is the MAT-locus, which has 2 distinct alleles interspersed along the whole-genome phylogenetic tree. Conclusions: The predominantly clonal structure of the genome sequences is consistent with the observations that sexual reproduction is rare in P. spp. The small number of regions with noncanonical phylogenies seem to arise due to some recombination events between derived lineages of P. spp., with the MAT-locus being transferred on multiple occasions. All sequenced strains have a heterothallic configuration of the MAT-locus. http://deepblue.lib.umich.edu/bitstream/2027.42/111733/1/12864_2015_Article_1570.pd

    CORECLUST: identification of the conserved CRM grammar together with prediction of gene regulation

    Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory 'grammar', or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method, CORECLUST (COnservative REgulatory CLUster STructure), that predicts CRMs based on a set of positional weight matrices. Given regulatory regions of orthologous and/or co-regulated genes, CORECLUST constructs a CRM model by revealing the conserved rules that describe the relative location of binding sites. The constructed model may subsequently be used for the genome-wide prediction of similar CRMs, and thus detection of co-regulated genes, and for the investigation of the regulatory grammar of the system. Compared with related methods, CORECLUST shows better performance at identification of CRMs conferring muscle-specific gene expression in vertebrates and early-developmental CRMs in Drosophila.
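
    For readers unfamiliar with the positional weight matrices mentioned in this abstract, the following is a minimal, generic sketch of how a sequence can be scanned with one such matrix. It is not the CORECLUST algorithm and does not model any CRM grammar. The function names `log_odds_pwm` and `scan`, the toy motif counts, the uniform background, and the score threshold are all assumptions made for illustration.

```python
import math

def log_odds_pwm(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into log2-odds scores.

    Assumes a uniform background frequency and adds a pseudocount
    to every base so that unseen bases get a finite score.
    """
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, score) for each window scoring at or above threshold.

    Assumes the sequence contains only the letters A, C, G and T.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Toy counts for a hypothetical 4-bp motif resembling "TATA" (illustrative only).
toy_counts = [
    {"T": 9, "A": 1},
    {"A": 9, "T": 1},
    {"T": 9, "A": 1},
    {"A": 9, "T": 1},
]
pwm = log_odds_pwm(toy_counts)
hits = scan("GGTATAGG", pwm, threshold=4.0)
```

    With these toy inputs, the only window that passes the threshold is the one starting at position 2, which spells "TATA". A CRM predictor of the kind the abstract describes would build on many such matrices and on the relative placement of their hits, which this sketch leaves out.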

    KBase: The United States Department of Energy Systems Biology Knowledgebase.


    Probing-directed identification of novel structured RNAs

    Transcripts often harbor RNA elements, which regulate cell processes co- or post-transcriptionally. The functions of many regulatory RNA elements depend on their structure, thus it is important to determine the structure as well as to scan genomes for structured elements. State-of-the-art ab initio approaches to predict structured RNAs rely on DNA sequence analysis. They use 2 major types of information inferred from a sequence: thermodynamic stability of an RNA structure and evolutionary footprints of base-pair interactions. In recent years, chemical probing of RNA has arisen as an alternative source of structural information. RNA probing experiments detect positions accessible to specific types of chemicals or enzymes indicating their propensity to be in a paired or unpaired state. There exist several strategies to integrate probing data into RNA secondary structure prediction algorithms that substantially improve the prediction quality. However, whether and how probing data could contribute to detection of structured RNAs remains an open question. We previously developed the energy-based approach RNASurface to detect locally optimal structured RNA elements. Here, we integrate probing data into the RNASurface energy model using a general framework. We show that the use of experimental data allows for better discrimination of ncRNAs from other transcripts. Application of RNASurface to genome-wide analysis of the human transcriptome with PARS data identifies previously undetectable segments, with evidence of functionality for some of them.